A novel algorithm for rapid speaker adaptation based on structural maximum likelihood eigenspace mapping
نویسندگان
چکیده
In this paper, we propose a novel algorithm for rapid speaker adaptation based on our Structural Maximum Likelihood Eigenspace Mapping (SMLEM). The proposed method constructs a binary-tree structured hierarchical Speaker Independent (SI) eigenspace at different levels from well-trained SI system models, and then dynamically constructs a new set of speaker dependent (SD) eigenspaces at corresponding levels, according to the availability of incoming adaptation data. By mapping the mixture Gaussian components from a SI eigenspace to SD eigenspaces in a maximum likelihood manner, the SI models are adapted towards SD models (EM algorithm is used to derive the eigenspace bias). Compared with conventional MLLR, the proposed algorithm is both computationally cheaper and more effective when only a very small amount (from 5 to 15 seconds) of adaptation data is available. In our simulations using the DARPA WSJ Spoke3 corpus, an average of 10.5% relative reduction in WER was achieved over MLLR adaptation when using 5 seconds data for adaptation.
منابع مشابه
Improved Structural Maximum Mapping for Rapid Speak
In this paper, we expand on a previously proposed algorithm entitled Structural Maximum Likelihood Eigenspace Mapping (SMLEM) [5, 6] for rapid speaker adaptation by exploring a variety of model clustering methods and incorporating a multi-stream approach. The SMLEM algorithm directly adapts speaker independent acoustic models to a test speaker by mapping the mixture Gaussian components from a s...
متن کاملFast speaker adaptation using eigenspace-based maximum likelihood linear regression
This paper presents an eigenspace-based fast speaker adaptation approach which can improve the modeling accuracy of the conventional maximum likelihood linear regression (MLLR) techniques when only very limited adaptation data is available. The proposed eigenspace-based MLLR approach was developed by introducing a priori knowledge analysis on the training speakers via PCA, so as to construct an...
متن کاملEigenspace-based speaker adaptation methods in Persian speech recognition systems
Among speaker adaptation algorithms, eigenvoice (EV) and eigenspace-based MLLR (EMLLR) adaptation approaches have been proposed for rapid adaptation with very limited adaptation data. In these methods, a speaker adapted model is constrained to be a weighted combination of some orthogonal basis vectors. In this manner, both the number of parameters to be estimated from the adaptation data, and t...
متن کاملImproving robustness of MLLR adaptation with speaker-clustered regression class trees
We introduce a strategy for modeling speaker variability in speaker adaptation based on maximum likelihood linear regression (MLLR). The approach uses a speaker clustering procedure that models speaker variability by partitioning a large corpus of speakers in the eigenspace of their MLLR transformations and learning clusterspecific regression class tree structures. We present experiments showin...
متن کاملSpeaker clustered regression-class trees for MLLR adaptation
A speaker clustering algorithm is presented that is based on an eigenspace representation of Maximum Likelihood Linear Regression (MLLR) transformations and is used for training cluster-dependent regression-class trees for MLLR adaptation. It is shown that significant automatic speech recognition (ASR) system performance gains are possible by choosing the best regression-class tree structure fo...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2001